Amendments to the Claims

This listing of claims replaces all prior versions, and listings, of claims in the above- 
identified application: 

1. (Currently Amended) A method for use in sequence data analysis comprising:

providing a multiple sequence alignment of a plurality of sequences, wherein the multiple 
sequence alignment comprises a column of aligned amino acids and/or gaps for each horizontal
position of the multiple sequence alignment; 

providing a plurality of numerical physical-chemical property (PCP) descriptors for each 
amino acid based on a plurality of physical-chemical properties thereof, wherein each of the 
plurality of numerical PCP descriptors corresponds to one of "N" eigenvectors used in defining 
the amino acids in terms of physical-chemical properties; 

describing each amino acid in the multiple sequence alignment quantitatively in terms of 
the plurality of PCP descriptors as a series of "N" eigenvectors resulting in "N" PCP described 
sequence alignments, wherein each PCP described sequence alignment corresponds to and is 
defined with numerical PCP descriptors which correspond to one of the "N" eigenvectors, and 
further wherein each PCP described sequence alignment comprises a plurality of columns 
corresponding to the columns of the multiple sequence alignment; 

analyzing each of the PCP described sequence alignments, on a column by column basis, 
to generate conservation property data for each column, wherein the conservation property data 
for each column comprises an average value for the numerical PCP descriptors in the column 
and a standard deviation associated with the average value, and a relative entropy value for the 
column; 

analyzing the conservation property data for each of the PCP described sequence 
alignments to detect consecutive horizontal positions of the multiple sequence alignment where 
the physical-chemical properties are conserved based on at least the relative entropy determined 
for each column; and 

defining one or more PCP motifs in the multiple sequence alignment based at least on the
detection of consecutive horizontal positions of the multiple sequence alignment where the
physical-chemical properties are conserved according to at least one eigenvector; and

using the one or more PCP motifs to search a sequence database resulting in 
identification of one or more related sequence segments having PCP characteristics similar to 
one or more of the PCP motifs.
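
For illustration only, the column-by-column analysis recited in claim 1 might be sketched as follows for a single eigenvector. The descriptor values, the function names, and the histogram-based relative entropy (a Kullback-Leibler divergence against the background distribution of all descriptor values) are assumptions made for this sketch; the claims do not specify how the relative entropy is computed.

```python
import math
from statistics import mean, pstdev

# Hypothetical descriptor values for ONE eigenvector (illustrative only,
# not taken from the application).
DESCRIPTOR = {"A": 0.1, "L": 0.9, "I": 1.0, "V": 0.8,
              "K": -1.0, "R": -0.9, "D": -0.8, "E": -0.7}

def _histogram(values, lo, width, n_bins):
    """Fraction of values falling in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / len(values) for c in counts]

def column_conservation(column, descriptor=DESCRIPTOR, n_bins=4):
    """Return (mean, std, relative entropy) for one alignment column.

    Gaps ('-') are skipped; an all-gap column yields None.
    """
    values = [descriptor[aa] for aa in column if aa != "-"]
    if not values:
        return None
    background = list(descriptor.values())
    lo, hi = min(background), max(background)
    width = (hi - lo) / n_bins
    p = _histogram(values, lo, width, n_bins)      # column distribution
    q = _histogram(background, lo, width, n_bins)  # assumed background
    rel_entropy = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return mean(values), pstdev(values), rel_entropy
```

Under these assumptions, a column such as "LIVL" (similar descriptor values) yields a higher relative entropy than a mixed column such as "LKAD".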

2. (Previously presented) The method of claim 1, wherein analyzing the conservation
property data for each of the PCP described sequence alignments comprises analyzing the 
conservation property data for each of the PCP described sequence alignments to detect 
consecutive horizontal positions where the relative entropy satisfies a predetermined limit. 

3. (Previously presented) The method of claim 1, wherein defining one or more PCP motifs
in the multiple sequence alignment further comprises using user specified gap and minimum 
length limits to define the one or more PCP motifs, wherein each PCP motif comprises a 
plurality of consecutive horizontal positions in the multiple sequence alignment. 
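
As a hedged sketch of claims 2 and 3, consecutive positions whose relative entropy satisfies a limit could be grouped into motifs as below. The function name, the parameter names, and the reading of the "gap" limit as the maximum number of consecutive non-conserved positions tolerated inside a motif are assumptions of this sketch.

```python
def find_motifs(rel_entropy, limit, max_gap=1, min_length=3):
    """Return (start, end) index pairs, inclusive, of candidate motifs.

    A position counts as conserved when its relative entropy is >= limit.
    Conserved positions separated by at most max_gap non-conserved
    positions are merged; runs shorter than min_length are discarded.
    """
    motifs = []
    start = last = None
    for i, h in enumerate(rel_entropy):
        if h < limit:
            continue
        # Close the current run if the gap since the last conserved
        # position is too large.
        if start is not None and i - last - 1 > max_gap:
            if last - start + 1 >= min_length:
                motifs.append((start, last))
            start = None
        if start is None:
            start = i
        last = i
    if start is not None and last - start + 1 >= min_length:
        motifs.append((start, last))
    return motifs
```

For example, with a limit of 0.5, the values [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1, 0.9, 0.9] give a single motif spanning positions 1 through 4; the final two conserved positions are dropped as shorter than the minimum length.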

4. (Cancelled) The method of claim 1, further wherein the method comprises using the one
or more PCP motifs to search a sequence database for related sequence segments having PCP
characteristics similar to one or more of the PCP motifs.

5. (Previously presented) The method of claim 4, wherein each PCP motif comprises a 
plurality of consecutive horizontal positions in the multiple sequence alignment, wherein using 
the one or more PCP motifs to search a sequence database for related sequence segments 
comprises defining each of the PCP motifs as a series of PCP motif profile matrices, wherein 
each PCP motif profile matrix of the series corresponds to one of the "N" eigenvectors, and 
further wherein values for each PCP motif profile matrix comprise an average value of the
numerical PCP descriptors in the column at each horizontal position of the PCP motif and a
standard deviation associated with the average value, and a relative entropy value for each
horizontal position of the PCP motif.

6. (Previously presented) The method of claim 5, wherein using the one or more PCP 
motifs to search a sequence database for related sequence segments comprises: 

converting each of one or more sequences of the sequence database to a searchable form 
using the numerical PCP descriptors; 

using a positional scoring function to match values of the series of PCP motif profile 
matrices defined for each PCP motif to segments of each of the searchable matrices resulting in 
scored segments; and 

selecting at least one scored segment for each of the searchable matrices as being a best 
match to each PCP motif based on results of the positional scoring function. 
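
The matching step of claim 6 might be sketched, for one eigenvector, as a sliding-window comparison. The claims do not define the positional scoring function; the entropy-weighted squared deviation used here (lower is better), the floor on the standard deviation, and all names are assumptions of this sketch.

```python
def best_segment(sequence_values, profile, min_sigma=0.1):
    """Slide a motif profile along a descriptor-encoded sequence.

    sequence_values: descriptor values of the sequence for one eigenvector.
    profile: list of (mean, std, rel_entropy) tuples, one per motif position.
    Returns (start_index, score) of the lowest-scoring window, or None if
    the sequence is shorter than the motif.
    """
    m = len(profile)
    best = None
    for start in range(len(sequence_values) - m + 1):
        score = 0.0
        for offset, (mu, sigma, weight) in enumerate(profile):
            # Deviation in units of the column's standard deviation,
            # weighted by how conserved that position is.
            z = (sequence_values[start + offset] - mu) / max(sigma, min_sigma)
            score += weight * z * z
        if best is None or score < best[1]:
            best = (start, score)
    return best
```

A search over all "N" eigenvectors could, under the same assumptions, sum the per-eigenvector window scores before selecting the best segment.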

7. (Previously presented) The method of claim 6, wherein each of the selected scored 
segments forms a part of one of a plurality of proteins of the sequence database, and wherein the 
method further comprises ranking the plurality of proteins according to which protein has PCP 
characteristics that are the closest to the plurality of sequences used to provide the multiple 
sequence alignment. 

8. (Previously presented) The method of claim 7, wherein ranking the plurality of proteins 
comprises ranking one or more of the plurality of proteins based on application of a Bayesian 
scoring function. 

9. (Previously presented) The method of claim 7, wherein ranking the plurality of proteins 
further comprises ranking one or more of the plurality of proteins based on structural similarity. 

10. (Previously presented) The method of claim 7, wherein ranking the plurality of proteins
comprises:

determining an overall PCP similarity distance score associated with each of the one or 
more proteins of the sequence database; and 

ranking the one or more proteins of the sequence database based on the overall PCP 
similarity scores for the proteins and relative to what a random score for the proteins would be. 
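
One possible reading of ranking "relative to what a random score ... would be" is a z-score against shuffled versions of the same sequence, sketched below. The shuffling scheme, the plain squared-distance score, and the function names are assumptions of this sketch and are not stated in the claims.

```python
import random
from statistics import mean, pstdev

def min_distance(values, motif_means):
    """Smallest squared distance between the motif and any window."""
    m = len(motif_means)
    return min(
        sum((values[s + k] - motif_means[k]) ** 2 for k in range(m))
        for s in range(len(values) - m + 1)
    )

def z_vs_random(values, motif_means, n_shuffles=200, seed=0):
    """Observed distance expressed in standard deviations from the
    mean distance of shuffled sequences (more negative = better match)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = min_distance(values, motif_means)
    shuffled_scores = []
    for _ in range(n_shuffles):
        shuffled = list(values)
        rng.shuffle(shuffled)
        shuffled_scores.append(min_distance(shuffled, motif_means))
    spread = pstdev(shuffled_scores)
    if spread == 0:
        return 0.0
    return (observed - mean(shuffled_scores)) / spread
```

Proteins could then be ranked by sorting on this value in ascending order.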

11. (Previously presented) The method of claim 6, wherein each of the selected scored
segments forms a part of one of a plurality of proteins of the sequence database, and wherein the
method further comprises: 

providing structural data for the one or more selected sequence segments; 
providing query structural data related to the PCP motifs; 

calculating segmental root mean square deviation between the query structural data and 
the structural data for the one or more selected sequence segments; and 

ranking the one or more proteins of the sequence database based on the calculated 
segmental root mean square deviation. 
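
The segmental root mean square deviation of claim 11 could be computed as in the following minimal sketch, which assumes the two segments have equal length and that their coordinates have already been superimposed; no fitting step is performed here, and the names are hypothetical.

```python
import math

def segment_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("segments must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

def rank_by_rmsd(query_coords, candidates):
    """Sort (name, coords) candidates by ascending RMSD to the query."""
    return sorted(candidates, key=lambda item: segment_rmsd(query_coords, item[1]))
```

Candidates with the smallest deviation from the query structure appear first in the ranking.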

12. (Previously presented) A computer program for use in conjunction with a processing 
apparatus to analyze sequence data, wherein the computer program is operable when used with 
the processing apparatus to: 

recognize a multiple sequence alignment of a plurality of sequences, wherein the multiple 
sequence alignment comprises a column of aligned amino acids and/or gaps for each horizontal 
position of the multiple sequence alignment; 

recognize a plurality of numerical physical-chemical property (PCP) descriptors for each 
amino acid based on a plurality of physical-chemical properties thereof, wherein each of the 
plurality of numerical PCP descriptors corresponds to one of "N" eigenvectors used in defining 
the amino acids in terms of physical-chemical properties; 

describe each amino acid in the multiple sequence alignment quantitatively in terms of
the plurality of PCP descriptors as a series of "N" eigenvectors resulting in "N" PCP described 
sequence alignments, wherein each PCP described sequence alignment corresponds to and is 
defined with numerical PCP descriptors which correspond to one of the "N" eigenvectors, and 
further wherein each PCP described sequence alignment comprises a plurality of columns 
corresponding to the columns of the multiple sequence alignment; 

analyze each of the PCP described sequence alignments, on a column by column basis, to 
generate conservation property data for each column, wherein the conservation property data for
each column comprises an average value for the numerical PCP descriptors in the column and a 
standard deviation associated with the average value, and a relative entropy value for the 
column; 

analyze the conservation property data for each of the PCP described sequence 
alignments to detect consecutive horizontal positions of the multiple sequence alignment where 
the physical-chemical properties are conserved based on at least the relative entropy determined 
for each column; and 

define one or more PCP motifs in the multiple sequence alignment based at least on the 
detection of consecutive horizontal positions of the multiple sequence alignment where the 
physical-chemical properties are conserved according to at least one eigenvector, wherein the
one or more PCP motifs are usable to search a sequence database resulting in the identification
of one or more related sequence segments having PCP characteristics similar to one or more of
the PCP motifs.

13. (Previously presented) The computer program of claim 12, wherein the computer
program is operable when used with the processing apparatus to analyze the conservation
property data for each of the PCP described sequence alignments by analyzing the conservation
property data for each of the PCP described sequence alignments to detect consecutive 
horizontal positions where the relative entropy satisfies a predetermined limit. 



Amendment and Response Page 9 of 20 

Serial No.: 10/817,530 
Confirmation No.: 4868 
Filed: April 2, 2004 

For: PHYSICAL-CHEMICAL PROPERTY BASED SEQUENCE MOTIFS AND METHODS REGARDING 
SAME 

14. (Previously presented) The computer program of claim 12, wherein the computer 
program is operable when used with the processing apparatus to define one or more PCP motifs
in the multiple sequence alignment using user specified gap and minimum length limits to define 
the one or more PCP motifs, wherein each PCP motif comprises a plurality of consecutive 
horizontal positions in the multiple sequence alignment. 

15. (Previously presented) The computer program of claim 12, wherein the computer
program is further operable when used with the processing apparatus to use the one or more PCP
motifs to search a sequence database for related sequence segments having PCP characteristics
similar to one or more of the PCP motifs. 

16. (Previously presented) The computer program of claim 15, wherein each PCP motif 
comprises a plurality of consecutive horizontal positions in the multiple sequence alignment, and 
wherein the computer program is further operable when used with the processing apparatus to 
define each of the PCP motifs as a series of PCP motif profile matrices, wherein each PCP motif 
profile matrix of the series corresponds to one of the "N" eigenvectors, and further wherein 
values for each PCP motif profile matrix comprise an average value of the numerical PCP 
descriptors in the column at each horizontal position of the PCP motif and a standard deviation 
associated with the average value, and a relative entropy value for each horizontal position of the 
PCP motif.

17. (Previously presented) The computer program of claim 16, wherein the computer
program is further operable when used with the processing apparatus to: 

convert each of one or more sequences of the sequence database to a searchable form 
using the numerical PCP descriptors; 

use a positional scoring function to match values of the series of PCP motif profile 
matrices defined for each PCP motif to segments of each of the searchable matrices resulting in 
scored segments; and 

select at least one scored segment for each of the searchable matrices as being a best 
match to each PCP motif based on results of the positional scoring function. 

18. (Previously presented) The computer program of claim 17, wherein each of the selected
scored segments forms a part of one of a plurality of proteins of the sequence database, and 
wherein the computer program is further operable when used with the processing apparatus to 
rank one or more of the plurality of proteins according to which protein has PCP characteristics 
that are the closest to the plurality of sequences used to provide the multiple sequence alignment. 

19. (Previously presented) The computer program of claim 18, wherein the computer
program is operable when used with the processing apparatus to rank one or more of the 
plurality of proteins based on application of a Bayesian scoring function. 

20. (Previously presented) The computer program of claim 18, wherein the computer
program is operable when used with the processing apparatus to rank one or more of the 
plurality of proteins based on structural similarity. 

21. (Previously presented) The computer program of claim 18, wherein the computer
program is operable when used with the processing apparatus to: 

determine an overall PCP similarity distance score associated with each of the one or 
more proteins of the sequence database; and 

rank the one or more proteins of the sequence database based on the overall PCP 
similarity scores for the proteins and relative to what a random score for the proteins would be. 

22. (Previously presented) The method of claim 17, wherein each of the selected scored
segments forms a part of one of a plurality of proteins of the sequence database, and wherein the 
computer program is operable when used with the processing apparatus to: 

recognize structural data for the one or more selected sequence segments; 

recognize query structural data related to the PCP motifs; 

calculate segmental root mean square deviation between the query structural data and the 
structural data for the one or more selected sequence segments; and 

rank the one or more proteins of the sequence database based on the calculated segmental 
root mean square deviation. 



